02-06/12/2019
## What is R? (Ben)
- Object-oriented programming language
- Designed to make data handling and statistics intuitive
For this training, we will use:
To install a package, type install.packages("package_name") in the console, then press Enter.
Run the following code to install all the necessary packages
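A minimal sketch of one way to install only the packages that are still missing. The package vector is an assumption: it lists packages named elsewhere in this material, not the official training list, so adjust it as needed. `find_missing` is a hypothetical helper name.

```r
# Return the subset of `pkgs` that is not yet installed on this machine.
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Assumed package list (hypothetical; replace with the list given in the training)
training_pkgs <- c("dplyr", "tidyr", "rio", "gapminder", "ggplot2")

to_install <- find_missing(training_pkgs)

# Only install from an interactive session (e.g. the RStudio console),
# so that scripted runs do not trigger downloads.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids reinstalling packages that are already present, which saves time on a slow connection.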
    ## # A tibble: 6 x 6
    ##   country     continent  year lifeExp      pop gdpPercap
    ##   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
    ## 1 Afghanistan Asia       1952    28.8  8425333      779.
    ## 2 Afghanistan Asia       1957    30.3  9240934      821.
    ## 3 Afghanistan Asia       1962    32.0 10267083      853.
    ## 4 Afghanistan Asia       1967    34.0 11537966      836.
    ## 5 Afghanistan Asia       1972    36.1 13079460      740.
    ## 6 Afghanistan Asia       1977    38.4 14880372      786.
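A preview like the one above can be produced with `head()`. This sketch assumes the gapminder package is installed and that the preview comes from its `gapminder` data frame; the object name `first_rows` is only illustrative.

```r
# Assumes the gapminder package is installed; it provides the `gapminder` data frame.
library(gapminder)

# head() returns the first six rows by default
first_rows <- head(gapminder)
print(first_rows)

# Number of rows and columns in the preview
dim(first_rows)
```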
Different packages/functions to import data from different file formats
Swiss-army knife for data import/export: rio
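As a rough illustration of the rio approach: `import()` and `export()` choose the file format from the file extension. The data frame and file names below are made up for the example, and the sketch assumes rio is installed.

```r
library(rio)

# Small illustrative data frame (made-up values)
scores <- data.frame(id = 1:3, score = c(4.5, 3.0, 5.0))

# Write to temporary files; the extension decides the format
csv_path <- file.path(tempdir(), "scores.csv")
rds_path <- file.path(tempdir(), "scores.rds")
export(scores, csv_path)
export(scores, rds_path)

# Read them back with the same function, whatever the format
from_csv <- import(csv_path)
from_rds <- import(rds_path)
```

The same two functions cover the common formats, so there is no need to remember a separate reader for each one.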
Different packages for different DBMS, e.g.:
E.g.: Importing data from MariaDB
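A hedged sketch of the general DBI pattern: connect, query, disconnect. To keep it runnable without a database server, it uses an in-memory SQLite database as a stand-in (assumes the DBI and RSQLite packages are installed). For MariaDB, only the connection call should need to change, as shown in the comment; the host, database, user, and environment variable names there are hypothetical.

```r
library(DBI)

# Stand-in connection: in-memory SQLite (not MariaDB)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# For MariaDB the connection would instead look like this
# (all values are hypothetical placeholders):
# con <- dbConnect(RMariaDB::MariaDB(),
#                  host = "localhost", dbname = "training",
#                  user = "trainee",
#                  password = Sys.getenv("MARIADB_PASSWORD"))

# Create a small illustrative table so there is something to import
dbWriteTable(con, "patients", data.frame(id = 1:3, age = c(34, 51, 27)))

# Import the result of an SQL query as a data frame
patients <- dbGetQuery(con, "SELECT id, age FROM patients WHERE age > 30")

# Always close the connection when done
dbDisconnect(con)
```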
| Package | Function | Use |
|---|---|---|
| dplyr | select | select variables/columns |
| dplyr | filter | select observations/rows |
| dplyr | mutate | transform or recode variables |
| dplyr | summarize | summarize data |
| dplyr | group_by | identify subgroups for further processing |
| tidyr | gather | convert wide format dataset to long format |
| tidyr | spread | convert long format dataset to wide format |
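The functions in the table can be chained with the pipe (`%>%`). Below is a minimal sketch on a toy data frame whose columns mimic the gapminder data; the values and the object names are illustrative only.

```r
library(dplyr)
library(tidyr)

# Toy data shaped like gapminder (illustrative values, not real figures)
toy <- data.frame(
  country   = c("A", "A", "B", "B", "C", "C"),
  continent = c("X", "X", "X", "X", "Y", "Y"),
  year      = c(2002, 2007, 2002, 2007, 2002, 2007),
  lifeExp   = c(50, 52, 60, 63, 70, 71),
  stringsAsFactors = FALSE
)

# select columns, filter rows, mutate a new variable
recent <- toy %>%
  select(country, continent, year, lifeExp) %>%
  filter(year == 2007) %>%
  mutate(lifeExp_months = lifeExp * 12)

# group_by + summarize: one row per continent
by_continent <- recent %>%
  group_by(continent) %>%
  summarize(mean_lifeExp = mean(lifeExp))

# spread: long to wide (one column per year)
wide <- toy %>%
  select(country, year, lifeExp) %>%
  spread(key = year, value = lifeExp)

# gather: wide back to long (every column except country)
long <- wide %>%
  gather(key = "year", value = "lifeExp", -country)
```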
(Interactive activity)
- gmx: this will be gm, but we will filter to include only the most recent year (2007).
- ncountries: to do this, group the data by continent and tally the number of countries.
- Arrange ncountries from lowest to highest.
- moz: this will be the gapminder data, but just for Mozambique.
- mutate a new variable with the average GDP for that continent, then mutate another variable with the difference between each country and its continent's average GDP.
- … gapminder data?
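One possible way to approach these exercises, offered as a sketch rather than the official solution. It makes several assumptions: `gm` is replaced here by a small stand-in with made-up values, "GDP" is taken to mean the `gdpPercap` column, countries are tallied from the 2007 subset so that each is counted once, and `gmx_gdp` is a hypothetical object name.

```r
library(dplyr)

# Stand-in for `gm` (illustrative values, not real gapminder figures)
gm <- data.frame(
  country   = c("Mozambique", "Mozambique", "Kenya", "Kenya", "Peru", "Peru"),
  continent = c("Africa", "Africa", "Africa", "Africa", "Americas", "Americas"),
  year      = c(2002, 2007, 2002, 2007, 2002, 2007),
  gdpPercap = c(600, 800, 1200, 1400, 5000, 7000),
  stringsAsFactors = FALSE
)

# gmx: only the most recent year
gmx <- gm %>% filter(year == 2007)

# ncountries: number of countries per continent, lowest to highest
ncountries <- gmx %>%
  group_by(continent) %>%
  tally() %>%
  arrange(n)

# moz: just Mozambique, all years
moz <- gm %>% filter(country == "Mozambique")

# Continent average GDP and each country's difference from it
gmx_gdp <- gmx %>%
  group_by(continent) %>%
  mutate(continent_gdp = mean(gdpPercap),
         gdp_diff = gdpPercap - continent_gdp) %>%
  ungroup()
```

Using `mutate()` after `group_by()` keeps one row per country, whereas `summarize()` would collapse each continent to a single row.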